Abstract. Concurrent systems are notoriously difficult to analyze, and
technological advances such as weak memory architectures greatly compound
this problem. This has renewed interest in partial order semantics
as a theoretical foundation for formal verification techniques. Among
these, symbolic techniques have been shown to be particularly effective
at finding concurrency-related bugs because they can leverage highly
optimized decision procedures such as SAT/SMT solvers. This paper gives
new fundamental results on partial order semantics for SAT/SMT-based
symbolic encodings of weak memory concurrency. In particular, we give
the theoretical basis for a decision procedure that can handle a fragment of
concurrent programs endowed with least fixed point operators. In addition,
we show that a certain partial order semantics of relaxed sequential
consistency is equivalent to the conjunction of three extensively studied
weak memory axioms by Alglave et al. An important consequence of this
equivalence is an asymptotically smaller symbolic encoding for bounded
model checking which has only a quadratic number of partial order
constraints compared to the state-of-the-art cubic-size encoding.


1 Introduction 

Concurrent systems are notoriously difficult to analyze, and technological
advances such as weak memory architectures as well as highly available
distributed services greatly compound this problem. This has renewed interest
in partial order concurrency semantics as a theoretical foundation for formal 
verification techniques. Among these, symbolic techniques have been shown to 
be particularly effective at finding concurrency-related bugs because they can 
leverage highly optimized decision procedures such as SAT/SMT solvers. This 
paper studies partial order semantics from the perspective of SAT/SMT-based 
symbolic encodings of weak memory concurrency. 

Given the diverse range of partial order concurrency semantics, we link our
study to a recently developed unifying theory of concurrency by Tony Hoare
et al. [1]. This theory is known as Concurrent Kleene Algebra (CKA), which is
an algebraic concurrency semantics based on quantales, a special case of the
fundamental algebraic structure of idempotent semirings. Based on quantales,
CKA combines the familiar laws of the sequential program operator (;) with a
new operator for concurrent program composition ($\parallel$). A distinguishing feature
of CKA is its exchange law $(\mathcal{U} \parallel \mathcal{V});(\mathcal{X} \parallel \mathcal{Y}) \subseteq (\mathcal{U};\mathcal{X}) \parallel (\mathcal{V};\mathcal{Y})$ that describes
how sequential and concurrent composition operators can be interchanged.
Intuitively, since the binary relation $\subseteq$ denotes program refinement, the exchange
law expresses a divide-and-conquer mechanism for how concurrency may be
sequentially implemented on a machine. The exchange law, together with a
uniform treatment of programs and their specifications, is key to unifying existing
theories of concurrency [2]. CKA provides such a unifying theory [3,2] that has
practical relevance for proving program correctness, e.g. using rely/guarantee
reasoning [1]. Conversely, however, pure algebra cannot refute that a program
is correct or that certain properties about every program always hold [3,2,4].
This is problematic for theoretical reasons but also in practice because today's
software complexity requires a diverse set of program analysis tools that range
from proof assistants to automated testing. The solution is to accompany CKA
with a mathematical model which satisfies its laws so that we can prove as well
as disprove properties about programs.

One such well-known model-theoretical foundation for CKA is Pratt's [5]
and Gischer's [6] partial order model of computation that is constructed from
labelled partially ordered multisets (pomsets). Pomsets generalize the concept of
a string in finite automata theory by relaxing the total ordering of the
occurrence of letters within a string to a partial order. For example, $a \parallel a$ denotes a
pomset that consists of two unordered events that are both labelled with the
letter $a$. By partially ordering events, pomsets form an integral part of the
extensive theoretical literature on so-called 'true concurrency', e.g. [7,8,9,10,5,6],
in which pomsets strictly generalize Mazurkiewicz traces [11], and prime event
structures [10] are pomsets enriched with a conflict relation subject to certain
conditions. From an algorithmic point of view, the complexity of the pomset
language membership (PLM) problem is NP-complete, whereas the pomset language
containment (PLC) problem is $\Pi_2^p$-complete [12].

Importantly, these aforementioned theoretical results only apply to star-free 
pomset languages (without fixed point operators). In fact, the decidability of 
the equational theory of the pomset language closed under least fixed point, 
sequential and concurrent composition operators (but without the exchange 
law) has been only most recently established [13]; its complexity remains an 
open problem [13]. Yet another open problem is the decidability of this equational
theory together with the exchange law [13]. In addition, it is still unclear
how theoretical results about pomsets may be applicable to formal techniques 
for finding concurrency-related bugs. In fact, it is not even clear how insights 
about pomsets may be combined with most recently studied language-specific 
or hardware-specific concurrency semantics, e.g. [14,15,16,17]. 

These gaps are motivation to reinvestigate pomsets from an algorithmic 
perspective. In particular, our work connects pomsets to a SAT/SMT-based 
bounded model checking technique [18] where shared memory concurrency 
is symbolically encoded as partial orders. To make this connection, we adopt 
pomsets as partial strings (Definition 1) that are ordered by a refinement
relation (Definition 3) based on Esik's notion of monotonic bijective morphisms [19].
Our partial-string model then follows from the standard Hoare powerdomain
construction where sets of partial strings are downward-closed with respect to
monotonic bijective morphism (Definition 4). The relevance of this formalization
for the modelling of weak memory concurrency (including data races) is
explained through several examples. Our main contributions are as follows: 

1. We give the theoretical basis for a decision procedure that can handle a
fragment of concurrent programs endowed with least fixed point operators
(Theorem 2). This is accomplished by exploiting a form of periodicity, thereby
giving a mechanism for reducing a countably infinite number of events to a
finite number. This result particularly caters to partial order encoding
techniques that can currently only encode a finite number of events due to the
deliberate restriction to quantifier-free first-order logic, e.g. [18].

2. We then interpret a particular form of weak memory in terms of certain
downward-closed sets of partial strings (Definition 11), and show that our
interpretation is equivalent to the conjunction of three fundamental weak
memory axioms (Theorem 3), namely 'write coherence', 'from-read' and
'global read-from' [17]. Since all three axioms underpin extensive
experimental research into weak memory architectures [20], Theorem 3 gives
denotational partial order semantics a new practical dimension.

3. Finally, we prove that there exists an asymptotically smaller quantifier-free
first-order logic formula that has only $O(N^2)$ partial order constraints
(Theorem 4) compared to the state-of-the-art $O(N^3)$ partial order encoding for
bounded model checking [18] where $N$ is the maximal number of reads and
writes on the same shared memory address. This is significant because $N$
can be prohibitively large when concurrent programs frequently share data.

The rest of this paper is organized into three parts. First, we recall familiar 
concepts on partial-string theory (§ 2) on which the rest of this paper is based. 
We then prove a least fixed point reduction result (§ 3). Finally, we characterize
a particular form of relaxed sequential consistency in terms of three weak
memory axioms by Alglave et al. (§ 4). 

2 Partial-string theory 

In this section, we adapt an axiomatic model of computation that uses partial 
orders to describe the semantics of concurrent systems. For this, we recall familiar
concepts (Definitions 1, 2, 3 and 4) that underpin our mathematical model of
CKA (Theorem 1). This model is the basis for subsequent results in § 3 and § 4. 

Definition 1 (Partial string). Denote with $E$ a nonempty set of events. Let $\Gamma$ be
an alphabet. A partial string $p$ is a triple $\langle E_p, \alpha_p, \preceq_p \rangle$ where $E_p$ is a subset of $E$,
$\alpha_p : E_p \to \Gamma$ is a function that maps each event in $E_p$ to an alphabet symbol in $\Gamma$,
and $\preceq_p$ is a partial order on $E_p$. Two partial strings $p$ and $q$ are said to be disjoint
whenever $E_p \cap E_q = \emptyset$. A partial string $p$ is called empty whenever $E_p = \emptyset$. Denote
with $P_f$ the set of all finite partial strings $p$ whose event set $E_p$ is finite.


Fig. 1. A partial string $p = \langle E_p, \alpha_p, \preceq_p \rangle$ with events $E_p = \{e_0, e_1, e_2, e_3\}$ and
the labelling function $\alpha_p$ satisfying the following: $\alpha_p(e_0)$ = '$r_0 := [b]_{acquire}$',
$\alpha_p(e_1)$ = '$r_1 := [a]_{none}$', $\alpha_p(e_2)$ = '$[a]_{none} := 1$' and $\alpha_p(e_3)$ = '$[b]_{release} := 1$'.


Each event in the universe $E$ should be thought of as an occurrence of a
computational step, whereas letters in $\Gamma$ describe the computational effect of events.
Typically, we denote a partial string by $p$, or letters from $x$ through $z$. In essence,
a partial string $p$ is a partially-ordered set $\langle E_p, \preceq_p \rangle$ equipped with a labelling
function $\alpha_p$. A partial string is therefore the same as a labelled partial order (lpo),
see also Remark 1. We draw finite partial strings in $P_f$ as inverted Hasse
diagrams (e.g. Fig. 1), where the ordering between events may be interpreted as
a happens-before relation [8], a fundamental notion in distributed systems and
formal verification of concurrent systems, e.g. [16,17]. We remark the obvious
fact that the empty partial string is unique under component-wise equality.

Example 1. In the partial string in Fig. 1, $e_0$ happens-before $e_1$, whereas both $e_0$
and $e_2$ happen concurrently because neither $e_0 \preceq_p e_2$ nor $e_2 \preceq_p e_0$.
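To make Definition 1 and Example 1 concrete, the following is a minimal sketch in Python. The encoding is an assumption made for illustration only: a partial string is a triple of an event set, a labelling dictionary, and the order as a set of pairs. The helper names `closure`, `make_partial_string` and `concurrent` are hypothetical and do not come from the paper.

```python
from itertools import product

def closure(events, pairs):
    """Reflexive-transitive closure of `pairs` over `events`."""
    order = {(e, e) for e in events} | set(pairs)
    changed = True
    while changed:
        changed = False
        # product() snapshots its inputs, so adding to `order` here is safe
        for (a, b), (c, d) in product(list(order), repeat=2):
            if b == c and (a, d) not in order:
                order.add((a, d))
                changed = True
    return order

def make_partial_string(labels, pairs):
    """Assumed encoding: (event set, labelling dict, order as a set of pairs)."""
    events = frozenset(labels)
    return events, dict(labels), closure(events, pairs)

def concurrent(p, e1, e2):
    """Two events are concurrent when neither is ordered before the other."""
    _, _, order = p
    return (e1, e2) not in order and (e2, e1) not in order

# The partial string of Fig. 1: e0 happens-before e1, e2 happens-before e3.
fig1 = make_partial_string(
    {"e0": "r0 := [b]acquire", "e1": "r1 := [a]none",
     "e2": "[a]none := 1", "e3": "[b]release := 1"},
    {("e0", "e1"), ("e2", "e3")},
)
```

With this encoding, `concurrent(fig1, "e0", "e2")` holds, matching Example 1, whereas `e0` and `e1` are ordered.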

We abstractly describe the control flow in concurrent systems by adopting 
the sequential and concurrent operators on labelled partial orders [9,5,6,19,21]:


Definition 2 (Partial string operators). Let $x$ and $y$ be disjoint partial strings. Let
$x \parallel y = \langle E_{x \parallel y}, \alpha_{x \parallel y}, \preceq_{x \parallel y} \rangle$ and $x;y = \langle E_{x;y}, \alpha_{x;y}, \preceq_{x;y} \rangle$ be their concurrent and
sequential composition, respectively, where $E_{x \parallel y} = E_{x;y} = E_x \cup E_y$ such that, for
all events $e, e'$ in $E_x \cup E_y$, the following holds:

- $e \preceq_{x \parallel y} e'$ exactly if $e \preceq_x e'$ or $e \preceq_y e'$,

- $e \preceq_{x;y} e'$ exactly if ($e \in E_x$ and $e' \in E_y$) or $e \preceq_{x \parallel y} e'$,

- $\alpha_{x \parallel y}(e) = \alpha_{x;y}(e) = \alpha_x(e)$ if $e \in E_x$, and $\alpha_y(e)$ if $e \in E_y$.


For simplicity, we assume that partial strings can always be made disjoint
by renaming events if necessary. But this assumption could be avoided by using
coproducts, a form of constructive disjoint union [21]. When clear from the
context, we construct partial strings directly from the labels in $\Gamma$.


Example 2. If we ignore labels for now and let $p_i$ for all $0 \le i \le 3$ be four
partial strings which each consist of a single event $e_i$, then $(p_0; p_1) \parallel (p_2; p_3)$
corresponds to a partial string that is isomorphic to the one shown in Fig. 1.
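The two operators of Definition 2 and the construction in Example 2 can be sketched as follows. This is only an illustration under an assumed encoding (a partial string as a triple of event set, label dictionary and a reflexive, transitive set of order pairs); the function names `single`, `par` and `seq` are hypothetical.

```python
def single(event, label):
    """A partial string with exactly one event."""
    return frozenset({event}), {event: label}, {(event, event)}

def par(x, y):
    """Concurrent composition: union of events, labels and orders."""
    ex, ax, ox = x
    ey, ay, oy = y
    if ex & ey:
        raise ValueError("partial strings must be disjoint")
    return ex | ey, {**ax, **ay}, ox | oy

def seq(x, y):
    """Sequential composition: additionally order every event of x before every event of y."""
    ex, _, _ = x
    ey, _, _ = y
    events, labels, order = par(x, y)
    return events, labels, order | {(e, f) for e in ex for f in ey}

# Example 2: (p0; p1) || (p2; p3), ignoring labels (all set to "a" here).
p = [single(f"e{i}", "a") for i in range(4)]
example2 = par(seq(p[0], p[1]), seq(p[2], p[3]))
```

The resulting order contains, besides the reflexive pairs, only `("e0", "e1")` and `("e2", "e3")`, which is the shape drawn in Fig. 1.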


To formalize the set of all possible happens-before relations of a concurrent 
system, we rely on Esik's notion of monotonic bijective morphism [19]: 

Definition 3 (Partial string refinement). Let $x$ and $y$ be partial strings such that
$x = \langle E_x, \alpha_x, \preceq_x \rangle$ and $y = \langle E_y, \alpha_y, \preceq_y \rangle$. A monotonic bijective morphism from $x$
to $y$, written $f : x \to y$, is a bijective function $f$ from $E_x$ to $E_y$ such that, for all events
$e, e' \in E_x$, $\alpha_x(e) = \alpha_y(f(e))$, and if $e \preceq_x e'$, then $f(e) \preceq_y f(e')$. Then $x$ refines $y$,
written $x \sqsubseteq y$, if there exists a monotonic bijective morphism $f : y \to x$ from $y$ to $x$.


Fig. 2. Two partial strings $x$ and $y$ such that $x \sqsubseteq y$ provided
all the labels are preserved, e.g. $\alpha_x(e'_0) = \alpha_y(e_0)$.

Remark 1. Partial words [9] and pomsets [5,6] are defined in terms of
isomorphism classes of lpos. Unlike lpos in pomsets, however, we study partial strings
in terms of monotonic bijective morphisms [19] because isomorphisms are about
sameness whereas the exchange law on partial strings is an inequation [21].

The purpose of Definition 3 is to disregard the identity of events but retain 
the notion of 'subsumption', cf. [6]. The intuition is that $\sqsubseteq$ orders partial strings
according to their determinism. In other words, $x \sqsubseteq y$ for partial strings $x$ and
$y$ implies that all events ordered in $y$ have the same order in $x$.

Example 3. Fig. 2 shows a monotonic bijective morphism from a partial string as 
given in Fig. 1 to an N-shaped partial string that is almost identical to the one 
in Fig. 1 except that it has an additional partial order constraint, giving its N 
shape. One well-known fact about N-shaped partial strings is that they cannot 
be constructed as $x;y$ or $x \parallel y$ under any labelling [5]. However, this is not a
problem for our study, as will become clear after Definition 4. 

Our notion of partial string refinement is particularly appealing for symbolic
techniques of concurrency because the monotonic bijective morphism can
be directly encoded as a first-order logic formula modulo the theory of
uninterpreted functions. Such a symbolic partial order encoding would be fully
justified from a computational complexity perspective, as shown next.

Proposition 1. Let $x$ and $y$ be finite partial strings in $P_f$. The partial string
refinement (PSR) problem, i.e. whether $x \sqsubseteq y$, is NP-complete.

Proof. Clearly PSR is in NP. The NP-hardness proof proceeds by reduction from
the PLM problem [12]. Let $\Gamma^*$ be the set of strings, i.e. the set of finite partial
strings $s$ such that $\preceq_s$ is a total order (for all $e, e' \in E_s$, $e \preceq_s e'$ or $e' \preceq_s e$). Given a
finite partial string $p$, let $\mathcal{L}_p$ be the set of all strings which refine $p$; equivalently,
$\mathcal{L}_p = \{s \in \Gamma^* \mid s \sqsubseteq p\}$. So $\mathcal{L}_p$ denotes the same as $L(p)$ in [12, Definition 2.2].

Let $s$ be a string in $\Gamma^*$ and $P$ be a pomset over the alphabet $\Gamma$. By Remark 1,
fix $p$ to be a partial string in $P$. Thus $s$ refines $p$ if and only if $s$ is a member
of $\mathcal{L}_p$. Since this membership problem is NP-hard [12, Theorem 4.1], it follows
that the PSR problem is NP-hard. So the PSR problem is NP-complete. □
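As an illustration of Definition 3, the refinement check can be sketched as a brute-force search over all bijections, which is exponential in the number of events and in line with Proposition 1. This is not the symbolic encoding discussed in the paper; the encoding of partial strings as triples and the name `refines` are assumptions, and the choice of the extra edge from `e0` to `e3` in the N-shaped example is likewise an assumption made for illustration.

```python
from itertools import permutations

def refines(x, y):
    """True if x refines y: some label-preserving bijection f from E_y to E_x is monotonic."""
    ex, ax, ox = x
    ey, ay, oy = y
    if len(ex) != len(ey):
        return False
    source = sorted(ey)
    for image in permutations(sorted(ex)):
        f = dict(zip(source, image))
        labels_ok = all(ay[e] == ax[f[e]] for e in source)
        monotonic = all((f[a], f[b]) in ox for (a, b) in oy)
        if labels_ok and monotonic:
            return True
    return False

labels = {"e0": "r0 := [b]acquire", "e1": "r1 := [a]none",
          "e2": "[a]none := 1", "e3": "[b]release := 1"}
events = frozenset(labels)
reflexive = {(e, e) for e in events}

# The partial string of Fig. 1 and an N-shaped one with one extra constraint.
fig1 = (events, labels, reflexive | {("e0", "e1"), ("e2", "e3")})
n_shape = (events, labels, reflexive | {("e0", "e1"), ("e2", "e3"), ("e0", "e3")})
```

Here `refines(n_shape, fig1)` holds because every pair ordered in `fig1` is also ordered in `n_shape`, but not the other way round.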

Note that a single partial string is not enough to model mutually exclusive
(nondeterministic) control flow. To see this, consider a simple (possibly
sequential) system such as if * then P else Q where * denotes nondeterministic
choice. If the semantics of a program were a single partial string, then we would need
to find exactly one partial string that represents the fact that P executes or Q
executes, but never both. To model this, rather than using a conflict relation [10], we
resort to the simpler Hoare powerdomain construction where we lift sequential 
and concurrent composition operators to sets of partial strings. But since we 
are aiming (similar to Gischer [6]) at an over-approximation of concurrent systems, 
these sets are downward closed with respect to our partial string refinement 
ordering from Definition 3. Additional benefits of using the downward closure 
include that program refinement then coincides with familiar set inclusion and 
the ease with which later the Kleene star operators can be defined. 

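Definition 4 (Program). A program is a downward-closed set of finite partial strings
with respect to $\sqsubseteq$; equivalently, $\mathcal{X} \subseteq P_f$ is a program whenever $\downarrow_{\sqsubseteq} \mathcal{X} = \mathcal{X}$ where
$\downarrow_{\sqsubseteq} \mathcal{X} = \{y \in P_f \mid \exists x \in \mathcal{X} : y \sqsubseteq x\}$. Denote with $P$ the family of all programs.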

Since we only consider systems that terminate, each partial string x in a 
program X is finite. We reemphasize that the downward closure of such a set 
X can be thought of as an over-approximation of all possible happens-before 
relations in a concurrent system whose instructions are ordered according to 
the partial strings in X. Later on (§ 4) we make the downward closure of partial 
strings more precise to model a certain kind of relaxed sequential consistency. 

Example 4. Recall that N-shaped partial strings cannot be constructed as x;y 
or x || y under any labelling [5]. Yet, by downward-closure of programs, such 
partial strings are included in the over-approximation of all the happens-before 
relations exhibited by a concurrent system. In particular, according to
Example 3, the downward-closure of the set containing the partial string in Fig. 1
includes (among many others) the N-shaped partial string shown on the right 
in Fig. 2. In fact, we shall see in § 4 that this particular N-shaped partial string 
corresponds to a data race in the concurrent system shown in Fig. 3. 

It is standard [6,21] to define $0 = \emptyset$ and $1 = \{\bot\}$ where $\bot$ is the (unique)
empty partial string. Clearly 0 and 1 form programs in the sense of Definition 4. 
For the next theorem, we lift the two partial string operators (Definition 2) to 
programs in the standard way: 

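Definition 5 (Bow tie). Given two partial strings $x$ and $y$, denote with $x \bowtie y$ either
concurrent or sequential composition of $x$ and $y$. For all programs $\mathcal{X}, \mathcal{Y}$ in $P$ and
partial string operators $\bowtie$, $\mathcal{X} \bowtie \mathcal{Y} = \downarrow_{\sqsubseteq} \{x \bowtie y \mid x \in \mathcal{X}$ and $y \in \mathcal{Y}\}$ where $\mathcal{X} \parallel \mathcal{Y}$
and $\mathcal{X};\mathcal{Y}$ are called concurrent and sequential program composition, respectively.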

By denoting programs as sets of partial strings, we can now define Kleene
star operators $(-)^{\parallel}$ and $(-)^{;}$ for iterative concurrent and sequential program
composition, respectively, as least fixed points ($\mu$) using set union ($\cup$) as the
binary join operator that we interpret as the nondeterministic choice of two
programs. We remark that this is fundamentally different from the pomsets
recursion operators in ultra-metric spaces [22]. The next theorem could then be
summarized as saying that the resulting structure of programs, written $S$, is
a partial order model of an algebraic concurrency semantics that satisfies the
CKA laws [1]. Since CKA is an exemplar of the universal laws of
programming [2], we base the rest of this paper on our partial order model of CKA.

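Theorem 1. The structure $S = \langle P, \subseteq, \cup, 0, 1, ;, \parallel \rangle$ is a complete lattice, ordered by
subset inclusion (i.e. $\mathcal{X} \subseteq \mathcal{Y}$ exactly if $\mathcal{X} \cup \mathcal{Y} = \mathcal{Y}$), such that $\parallel$ and $;$ form unital
quantales over $\cup$ where $S$ satisfies the following:

$(\mathcal{U} \parallel \mathcal{V});(\mathcal{X} \parallel \mathcal{Y}) \subseteq (\mathcal{U};\mathcal{X}) \parallel (\mathcal{V};\mathcal{Y})$
$\mathcal{X} \cup (\mathcal{Y} \cup \mathcal{Z}) = (\mathcal{X} \cup \mathcal{Y}) \cup \mathcal{Z}$
$\mathcal{X} \cup \mathcal{X} = \mathcal{X}$
$\mathcal{X} \cup 0 = 0 \cup \mathcal{X} = \mathcal{X}$
$\mathcal{X} \cup \mathcal{Y} = \mathcal{Y} \cup \mathcal{X}$
$\mathcal{X} \parallel \mathcal{Y} = \mathcal{Y} \parallel \mathcal{X}$
$\mathcal{X} \parallel 1 = 1 \parallel \mathcal{X} = \mathcal{X}$
$\mathcal{X};1 = 1;\mathcal{X} = \mathcal{X}$
$\mathcal{X} \parallel 0 = 0 \parallel \mathcal{X} = 0$
$\mathcal{X};0 = 0;\mathcal{X} = 0$
$\mathcal{X} \parallel (\mathcal{Y} \cup \mathcal{Z}) = (\mathcal{X} \parallel \mathcal{Y}) \cup (\mathcal{X} \parallel \mathcal{Z})$
$\mathcal{X};(\mathcal{Y} \cup \mathcal{Z}) = (\mathcal{X};\mathcal{Y}) \cup (\mathcal{X};\mathcal{Z})$
$(\mathcal{X} \cup \mathcal{Y}) \parallel \mathcal{Z} = (\mathcal{X} \parallel \mathcal{Z}) \cup (\mathcal{Y} \parallel \mathcal{Z})$
$(\mathcal{X} \cup \mathcal{Y});\mathcal{Z} = (\mathcal{X};\mathcal{Z}) \cup (\mathcal{Y};\mathcal{Z})$
$\mathcal{X} \parallel (\mathcal{Y} \parallel \mathcal{Z}) = (\mathcal{X} \parallel \mathcal{Y}) \parallel \mathcal{Z}$
$\mathcal{X};(\mathcal{Y};\mathcal{Z}) = (\mathcal{X};\mathcal{Y});\mathcal{Z}$
$\mathcal{P}^{\parallel} = \mu \mathcal{X}.\, 1 \cup (\mathcal{P} \parallel \mathcal{X})$
$\mathcal{P}^{;} = \mu \mathcal{X}.\, 1 \cup (\mathcal{P};\mathcal{X})$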


Proof. The details are in the accompanying technical report of this paper [21]. 
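The exchange law is stated for programs, but its underlying inequation on partial strings (cf. Remark 1) can be checked on a small instance. The sketch below is only an illustration under assumptions: partial strings are encoded as triples of event set, label dictionary and order pairs, the helpers `single`, `par`, `seq` and `refines` are hypothetical names, and refinement is decided by brute force over bijections.

```python
from itertools import permutations

def single(event, label):
    return frozenset({event}), {event: label}, {(event, event)}

def par(x, y):
    """Concurrent composition of disjoint partial strings."""
    return x[0] | y[0], {**x[1], **y[1]}, x[2] | y[2]

def seq(x, y):
    """Sequential composition: all events of x before all events of y."""
    events, labels, order = par(x, y)
    return events, labels, order | {(e, f) for e in x[0] for f in y[0]}

def refines(x, y):
    """True if some label-preserving monotonic bijection maps E_y onto E_x."""
    if len(x[0]) != len(y[0]):
        return False
    source = sorted(y[0])
    for image in permutations(sorted(x[0])):
        f = dict(zip(source, image))
        if all(y[1][e] == x[1][f[e]] for e in source) and \
           all((f[a], f[b]) in x[2] for (a, b) in y[2]):
            return True
    return False

u, v, x, y = (single(name, name) for name in "uvxy")
left = seq(par(u, v), par(x, y))    # (u || v);(x || y)
right = par(seq(u, x), seq(v, y))   # (u;x) || (v;y)
```

On this instance `refines(left, right)` holds while `refines(right, left)` does not, which illustrates why the exchange law is an inequation rather than an equation.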

By Theorem 1, it makes sense to call 1 in structure $S$ the $\bowtie$-identity program
where $\bowtie$ is a placeholder for either $;$ or $\parallel$. In the sequel, we call the binary
relation $\subseteq$ on $P$ the program refinement relation.

3 Least fixed point reduction 

This section is about the least fixed point operators $(-)^{;}$ and $(-)^{\parallel}$. Henceforth,
we shall denote these by $(-)^{\bowtie}$. We show that under a certain finiteness
condition (Definition 7) the program refinement problem $\mathcal{X}^{\bowtie} \subseteq \mathcal{Y}^{\bowtie}$ can be
reduced to a bounded number of program refinement problems without least
fixed points (Theorem 2). To prove this, we start by inductively defining the
notion of iteratively composing a program with itself under $\bowtie$.

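Definition 6 ($n$-iterated-$\bowtie$-program-composition). Let $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$ be the set
of non-negative integers. For all programs $\mathcal{P}$ in $P$ and non-negative integers $n$ in
$\mathbb{N}_0$, $\mathcal{P}^{0 \cdot \bowtie} = 1 = \{\bot\}$ is the $\bowtie$-identity program and $\mathcal{P}^{(n+1) \cdot \bowtie} = \mathcal{P} \bowtie \mathcal{P}^{n \cdot \bowtie}$.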

Clearly $(-)^{\bowtie}$ is the limit of its approximations in the following sense:

Proposition 2. For every program $\mathcal{P}$ in $P$, $\mathcal{P}^{\bowtie} = \bigcup_{n \ge 0} \mathcal{P}^{n \cdot \bowtie}$.

Definition 7 (Elementary program). A program $\mathcal{P}$ in $P$ is called elementary if $\mathcal{P}$
is the downward-closed set with respect to $\sqsubseteq$ of some finite and nonempty set $Q$ of finite
partial strings, i.e. $\mathcal{P} = \downarrow_{\sqsubseteq} Q$. The set of elementary programs is denoted by $P_\ell$.


An elementary program therefore could be seen as a machine-representable 
program generated from a finite and nonempty set of finite partial strings. This 
finiteness restriction makes the notion of elementary programs a suitable
candidate for the study of decision procedures. To make this precise, we define the
following unary partial string operator: 


Definition 8 ($n$-repeated-$\bowtie$ partial string operator). For every non-negative
integer $n$ in $\mathbb{N}_0$, $x^{0 \cdot \bowtie} = \bot$ is the empty partial string and $x^{(n+1) \cdot \bowtie} = x \bowtie x^{n \cdot \bowtie}$.

Intuitively, $p^{n \cdot \bowtie}$ is a partial string that consists of $n$ copies of a partial string
$p$, each combined by the partial string operator $\bowtie$. This is formalized as follows:

Proposition 3. Let $n \in \mathbb{N}_0$ be a non-negative integer. Define $[0] = \emptyset$ and $[n+1] =
\{1, \ldots, n+1\}$. For every partial string $x$, $x^{n \cdot \bowtie}$ is isomorphic to $y = \langle E_y, \alpha_y, \preceq_y \rangle$
where $E_y = E_x \times [n]$ such that, for all $e, e' \in E_x$ and $i, i' \in [n]$, the following holds:

- if '$\bowtie$' is '$\parallel$', then $(e, i) \preceq_y (e', i')$ exactly if $i = i'$ and $e \preceq_x e'$,

- if '$\bowtie$' is '$;$', then $(e, i) \preceq_y (e', i')$ exactly if $i < i'$ or ($i = i'$ and $e \preceq_x e'$),

- $\alpha_y((e, i)) = \alpha_x(e)$.
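The characterization in Proposition 3 translates almost literally into code. The following sketch builds the partial string on $E_x \times [n]$ directly; the triple encoding of partial strings and the function name `repeat` are assumptions made for illustration.

```python
def repeat(x, n, op):
    """Build the n-repeated partial string on E_x x [n]; op is ';' or '||'."""
    ex, ax, ox = x
    indices = range(1, n + 1)  # [n] = {1, ..., n}, and [0] is empty
    events = frozenset((e, i) for e in ex for i in indices)
    labels = {(e, i): ax[e] for (e, i) in events}
    if op == "||":
        # copies are mutually unordered; each copy keeps the order of x
        order = {((e, i), (f, i)) for (e, f) in ox for i in indices}
    elif op == ";":
        # earlier copies precede later ones; within a copy, the order of x applies
        order = {((e, i), (f, j))
                 for e in ex for f in ex for i in indices for j in indices
                 if i < j or (i == j and (e, f) in ox)}
    else:
        raise ValueError("op must be ';' or '||'")
    return events, labels, order

# A two-event partial string with a before b.
ab = (frozenset({"a", "b"}), {"a": "a", "b": "b"},
      {("a", "a"), ("b", "b"), ("a", "b")})
three_seq = repeat(ab, 3, ";")
three_par = repeat(ab, 3, "||")
```

In both cases the result has $3 \cdot 2 = 6$ events, which is the size argument used later in the proof of Proposition 4.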

Definition 9 (Partial string size). The size of a finite partial string $p$, denoted by
$|p|$, is the cardinality of its event set $E_p$.

For example, the partial string in Fig. 1 has size four. It is obvious that the
size of finite partial strings is non-decreasing under the $n$-repeated-$\bowtie$ partial
string operator from Definition 8 whenever $0 < n$. This simple fact is important
for the next step towards our least fixed point reduction result in Theorem 2:


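Proposition 4 (Elementary least fixed point pre-reduction). For all elementary
programs $\mathcal{X}$ and $\mathcal{Y}$ in $P_\ell$, if the $\bowtie$-identity program 1 is not in $\mathcal{Y}$ and $\mathcal{X} \subseteq \mathcal{Y}^{\bowtie}$,
then $\mathcal{X} \subseteq \bigcup_{n \ge k \ge 0} \mathcal{Y}^{k \cdot \bowtie}$ where $n = \lfloor \ell_{\mathcal{X}} / \ell_{\mathcal{Y}} \rfloor$ such that $\ell_{\mathcal{X}} = \max\{|x| \mid x \in \mathcal{X}\}$ and
$\ell_{\mathcal{Y}} = \min\{|y| \mid y \in \mathcal{Y}\}$ is the size of the largest and smallest partial strings in $\mathcal{X}$ and
$\mathcal{Y}$, respectively.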


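Proof. Assume $\mathcal{X} \subseteq \mathcal{Y}^{\bowtie}$. Let $x \in P_f$ be a finite partial string. We can assume
$x \in \mathcal{X}$ because $\mathcal{X} \neq \emptyset$. By assumption, $x \in \mathcal{Y}^{\bowtie}$. By Proposition 2, there exists
$k \in \mathbb{N}_0$ such that $x \in \mathcal{Y}^{k \cdot \bowtie}$. Fix $k$ to be the smallest such non-negative integer.
Show $k \le \lfloor \ell_{\mathcal{X}} / \ell_{\mathcal{Y}} \rfloor$ (the fraction is well-defined because $\mathcal{X}$ and $\mathcal{Y}$ are nonempty
and 1 is not in $\mathcal{Y}$). By downward closure and definition of $\sqsubseteq$ in terms of a one-to-one
correspondence, it suffices to consider that $x$ is one of the (not necessarily unique)
longest partial strings in $\mathcal{X}$, i.e. $|x'| \le |x|$ for all $x' \in \mathcal{X}$; equivalently, $|x| = \ell_{\mathcal{X}}$.
If $|x| = 0$, set $k = 0$, satisfying $1 = \mathcal{X} \subseteq \mathcal{Y}^{k \cdot \bowtie} = 1$ and $k \le n = 0$ as required.
Otherwise, since the size of partial strings in a program can never decrease
under the $k$-iterated program composition operator $\bowtie$ when $0 < k$, it suffices to
consider the case $x \sqsubseteq y^{k \cdot \bowtie}$ for some shortest partial string $y$ in $\mathcal{Y}$. Since $E_{y^{k \cdot \bowtie}}$ is
the Cartesian product of $E_y$ and $[k]$, it follows $|x| = k \cdot |y|$. Since $|x| \le \ell_{\mathcal{X}}$ and
$\ell_{\mathcal{Y}} \le |y|$, $k \le \lfloor \ell_{\mathcal{X}} / \ell_{\mathcal{Y}} \rfloor$. By definition $n = \lfloor \ell_{\mathcal{X}} / \ell_{\mathcal{Y}} \rfloor$, proving $x \in \bigcup_{n \ge k \ge 0} \mathcal{Y}^{k \cdot \bowtie}$. □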





Equivalently, if there exists a partial string $x$ in $\mathcal{X}$ such that $x \notin \mathcal{Y}^{k \cdot \bowtie}$ for all
non-negative integers $k$ between zero and $\lfloor \ell_{\mathcal{X}} / \ell_{\mathcal{Y}} \rfloor$, then $\mathcal{X} \not\subseteq \mathcal{Y}^{\bowtie}$. Since we are
interested in decision procedures for program refinement checking, we need to
show that the converse of Proposition 4 also holds. Towards this end, we prove
the following left $(-)^{\bowtie}$ elimination rule:


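Proposition 5. For every program $\mathcal{X}$ and $\mathcal{Y}$ in $P$, $\mathcal{X}^{\bowtie} \subseteq \mathcal{Y}^{\bowtie}$ exactly if $\mathcal{X} \subseteq \mathcal{Y}^{\bowtie}$.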


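Proof. Assume $\mathcal{X}^{\bowtie} \subseteq \mathcal{Y}^{\bowtie}$. By Proposition 2, $\mathcal{X} \subseteq \mathcal{X}^{\bowtie}$. By transitivity of $\subseteq$ in
$P$, $\mathcal{X} \subseteq \mathcal{Y}^{\bowtie}$. Conversely, assume $\mathcal{X} \subseteq \mathcal{Y}^{\bowtie}$. Let $i, j \in \mathbb{N}_0$. By induction on $i$,
$\mathcal{X}^{i \cdot \bowtie} \bowtie \mathcal{X}^{j \cdot \bowtie} = \mathcal{X}^{(i+j) \cdot \bowtie}$. Thus, by Proposition 2 and distributivity of $\bowtie$ over
least upper bounds in $P$, $\mathcal{X}^{\bowtie} \bowtie \mathcal{X}^{\bowtie} = \mathcal{X}^{\bowtie}$, i.e. $(-)^{\bowtie}$ is idempotent. This,
in turn, implies that $(-)^{\bowtie}$ is a closure operator. Therefore, by monotonicity,
$\mathcal{X}^{\bowtie} \subseteq (\mathcal{Y}^{\bowtie})^{\bowtie} = \mathcal{Y}^{\bowtie}$, proving that $\mathcal{X}^{\bowtie} \subseteq \mathcal{Y}^{\bowtie}$ is equivalent to $\mathcal{X} \subseteq \mathcal{Y}^{\bowtie}$. □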


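Theorem 2 (Elementary least fixed point reduction). For all elementary programs
$\mathcal{X}$ and $\mathcal{Y}$ in $P_\ell$, if the $\bowtie$-identity program 1 is not in $\mathcal{Y}$, then $\mathcal{X}^{\bowtie} \subseteq \mathcal{Y}^{\bowtie}$ is
equivalent to $\mathcal{X} \subseteq \bigcup_{n \ge k \ge 0} \mathcal{Y}^{k \cdot \bowtie}$ where $n = \lfloor \ell_{\mathcal{X}} / \ell_{\mathcal{Y}} \rfloor$ such that $\ell_{\mathcal{X}} = \max\{|x| \mid x \in \mathcal{X}\}$ and
$\ell_{\mathcal{Y}} = \min\{|y| \mid y \in \mathcal{Y}\}$ is the size of the largest and smallest partial strings in $\mathcal{X}$ and
$\mathcal{Y}$, respectively.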


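Proof. By Proposition 5, it remains to show that $\mathcal{X} \subseteq \mathcal{Y}^{\bowtie}$ is equivalent to
$\mathcal{X} \subseteq \bigcup_{n \ge k \ge 0} \mathcal{Y}^{k \cdot \bowtie}$ where $n = \lfloor \ell_{\mathcal{X}} / \ell_{\mathcal{Y}} \rfloor$. The forward and backward implications follow
from Propositions 4 and 2, respectively. □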

It follows immediately from Theorem 2 that $\mathcal{X}^{\bowtie} \subseteq \mathcal{Y}^{\bowtie}$ is decidable for all
elementary programs $\mathcal{X}$ and $\mathcal{Y}$ in $P_\ell$ because there exists an algorithm that
could iteratively make $O(|\mathcal{X}| \times |\mathcal{Y}|^n)$ calls to another decision procedure to
check whether $x \sqsubseteq y$ for all $x \in \mathcal{X}$ and $y \in \mathcal{Y}^{k \cdot \bowtie}$ where $n \ge k \ge 0$. However,
by Proposition 1, each iteration in such an algorithm would have to solve an
NP-complete subproblem. But this high complexity is expected since the PLC
problem is $\Pi_2^p$-complete [12].
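The bound $n$ of Theorem 2 is simple to compute from the sizes of the partial strings that generate the two elementary programs, since refinement is defined via a bijection and therefore preserves size. The sketch below only computes this bound; the function name `iteration_bound` and the representation of a program by the list of sizes of its generators are assumptions for illustration, and the refinement checks themselves are not shown.

```python
def iteration_bound(sizes_x, sizes_y):
    """Return n = floor(l_X / l_Y) for nonempty lists of partial string sizes.

    sizes_x: sizes of the partial strings generating X (l_X is their maximum)
    sizes_y: sizes of the partial strings generating Y (l_Y is their minimum)
    """
    l_x, l_y = max(sizes_x), min(sizes_y)
    if l_y == 0:
        # Theorem 2 requires that the empty partial string is not in Y
        raise ValueError("Y must not contain the empty partial string")
    return l_x // l_y

# If the largest string in X has 6 events and the smallest in Y has 2,
# only the iterations k = 0, 1, 2, 3 of Y need to be considered.
n = iteration_bound([4, 6], [2, 3])
candidate_iterations = list(range(n + 1))
```

For instance, when all partial strings in both programs have the same nonzero size, the bound is 1, so at most a single copy of $\mathcal{Y}$ has to be considered.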

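Corollary 1. For all elementary programs $\mathcal{X}$ and $\mathcal{Y}$ in $P$, if $|x| = |y|$ for all $x \in \mathcal{X}$
and $y \in \mathcal{Y}$, then $\mathcal{X}^{\bowtie} \subseteq \mathcal{Y}^{\bowtie}$ is equivalent to $\mathcal{X} \subseteq \mathcal{Y}$.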




We next move on to enriching our model of computation to accommodate a 
certain kind of relaxed sequential consistency. 


4 Relaxed sequential consistency 

For efficiency reasons, all modern computer architectures implement some form 
of weak memory model rather than sequential consistency [23]. A defining 
characteristic of weak memory architectures is that they violate interleaving
semantics unless specific instructions are used to restore sequential consistency.





Thread $T_1$                    Thread $T_2$

$r_0 := [b]_{acquire}$          $[a]_{none} := 1$
$r_1 := [a]_{none}$             $[b]_{release} := 1$

Fig. 3. A concurrent system $T_1 \parallel T_2$ consisting of two threads. The memory
accesses on memory location $b$ are synchronized, whereas those on $a$ are not.


This section fixes a particular interpretation of weak memory and studies the 
mathematical properties of the resulting partial order semantics. For this, we 
separate memory accesses into synchronizing and non-synchronizing ones, akin
to [24]. A synchronized store is called a release, whereas a synchronized load is
called an acquire. The intuition behind release/acquire is that prior writes made
to other memory locations by the thread executing the release become visible
in the thread that performs the corresponding acquire. Crucially, the particular
form of release/acquire semantics that we formalize here is shown to be
equivalent to the conjunction of three weak memory axioms (Theorem 3), namely
'write coherence', 'from-read' and 'global read-from' [17]. Subsequently, we 
look at one important ramification of this equivalence on bounded model checking 
(BMC) techniques for finding concurrency-related bugs (Theorem 4). 

We start by defining the alphabet that we use for identifying events that 
denote synchronizing and non-synchronizing memory accesses. 

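Definition 10 (Memory access alphabet). Define $\langle LOAD \rangle = \{none, acquire\}$,
$\langle STORE \rangle = \{none, release\}$ and $\langle BIT \rangle = \{0, 1\}$. Let $\langle ADDRESS \rangle$ and $\langle REG \rangle$ be
disjoint sets of memory locations and registers, respectively. Let $load\_tag \in \langle LOAD \rangle$
and $store\_tag \in \langle STORE \rangle$. Define the set of load and store labels, respectively:

$\Gamma_{load, load\_tag} = \{load\_tag\} \times \langle REG \rangle \times \langle ADDRESS \rangle$

$\Gamma_{store, store\_tag} = \{store\_tag\} \times \langle ADDRESS \rangle \times \langle BIT \rangle$

Let $\Gamma = \Gamma_{load,none} \cup \Gamma_{load,acquire} \cup \Gamma_{store,none} \cup \Gamma_{store,release}$ be the memory
access alphabet. Given $r \in \langle REG \rangle$, $a \in \langle ADDRESS \rangle$ and $b \in \langle BIT \rangle$, we write
'$r := [a]_{load\_tag}$' for the label $\langle load\_tag, r, a \rangle$ in $\Gamma_{load, load\_tag}$; similarly, '$[a]_{store\_tag} := b$'
is shorthand for the label $\langle store\_tag, a, b \rangle$ in $\Gamma_{store, store\_tag}$.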

Let $x$ be a partial string and $e$ be an event in $E_x$. Then $e$ is called a load or store if
its label, $\alpha_x(e)$, is in $\Gamma_{load, load\_tag}$ or $\Gamma_{store, store\_tag}$, respectively. A load or store event $e$
is a non-synchronizing memory access if $\alpha_x(e) \in \Gamma_{none} = \Gamma_{load,none} \cup \Gamma_{store,none}$;
otherwise, it is a synchronizing memory access. Let $a \in \langle ADDRESS \rangle$ be a memory
location. An acquire on $a$ is an event $e$ such that $\alpha_x(e)$ = '$r := [a]_{acquire}$' for some
$r \in \langle REG \rangle$. Similarly, a release on $a$ is an event $e$ labelled by '$[a]_{release} := b$' for some
$b \in \langle BIT \rangle$. A release and acquire is a release and acquire on some memory location,
respectively.
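The alphabet of Definition 10 can be written down as plain tuples. In the sketch below, the leading `"load"` / `"store"` component is an assumption added so that the two kinds of labels are easy to tell apart in code; it is not part of the definition, and the function names are hypothetical.

```python
LOAD_TAGS = {"none", "acquire"}
STORE_TAGS = {"none", "release"}

def load_label(tag, reg, addr):
    """Label written 'reg := [addr]tag' in the text."""
    if tag not in LOAD_TAGS:
        raise ValueError("unknown load tag")
    return ("load", tag, reg, addr)

def store_label(tag, addr, bit):
    """Label written '[addr]tag := bit' in the text."""
    if tag not in STORE_TAGS or bit not in (0, 1):
        raise ValueError("unknown store tag or non-bit value")
    return ("store", tag, addr, bit)

def is_synchronizing(label):
    """Acquire loads and release stores are synchronizing; 'none' accesses are not."""
    return label[1] != "none"

def show(label):
    """Render a label in the shorthand notation of Definition 10."""
    if label[0] == "load":
        _, tag, reg, addr = label
        return f"{reg} := [{addr}]{tag}"
    _, tag, addr, bit = label
    return f"[{addr}]{tag} := {bit}"

# The four labels used in Fig. 1 and Fig. 3.
fig_labels = [load_label("acquire", "r0", "b"), load_label("none", "r1", "a"),
              store_label("none", "a", 1), store_label("release", "b", 1)]
```

Applying `show` to the first and last label yields `r0 := [b]acquire` and `[b]release := 1`, and only these two are synchronizing.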

Example 5. Fig. 3 shows the syntax of a program that consists of two threads $T_1$
and $T_2$. This concurrent system can be directly modelled by the partial string
shown in Fig. 1 where memory location $b$ is accessed through acquire and
release, whereas memory location $a$ is accessed through non-synchronizing loads
and stores (shortly, we shall see that this leads to a data race).





Given Definition 10, we are now ready to refine our earlier conservative 
over-approximation of the happens-before relations (Definition 4) to get a particular
form of release/acquire semantics. For this, we restrict the downward
closure of programs X in P, in the sense of Definition 4, by requiring all partial 
strings in X to satisfy the following partial ordering constraints: 

Definition 11 (SC-relaxed program). A program $\mathcal{X}$ is called SC-relaxed if, for all
$a \in \langle ADDRESS \rangle$ and partial string $x$ in $\mathcal{X}$, the set of release events on $a$ is totally
ordered by $\preceq_x$ and, for every acquire $l \in E_x$ and release $s \in E_x$ on $a$, $l \preceq_x s$ or $s \preceq_x l$.

Henceforth, we denote loads and stores by $l, l'$ and $s, s'$, respectively. If $s$ and
$s'$ are release events that modify the same memory location, either $s$
happens-before $s'$, or vice versa. If $l$ is an acquire and $s$ is a release on the same memory
location, either $l$ happens-before $s$ or $s$ happens-before $l$. Importantly, however,
two acquire events $l$ and $l'$ on the same memory location may still happen
concurrently in the sense that neither $l$ happens-before $l'$ nor $l'$ happens-before $l$,
in the same way non-synchronizing memory accesses are generally unordered.
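The ordering constraints of Definition 11 can be checked for a single partial string as sketched below; a program is then SC-relaxed when the check succeeds for each of its partial strings. The tuple encoding of labels (with a leading `"load"` / `"store"` component), the triple encoding of partial strings and the name `is_sc_relaxed` are assumptions for illustration, as is the choice of the extra edge in the N-shaped example.

```python
def is_sc_relaxed(x):
    """Check the release/release and acquire/release constraints on one partial string."""
    events, labels, order = x

    def address(e):
        # loads are ("load", tag, reg, addr); stores are ("store", tag, addr, bit)
        return labels[e][3] if labels[e][0] == "load" else labels[e][2]

    releases = [e for e in events if labels[e][:2] == ("store", "release")]
    acquires = [e for e in events if labels[e][:2] == ("load", "acquire")]
    for s in releases:
        for other in releases + acquires:
            same_location = address(s) == address(other)
            ordered = (s, other) in order or (other, s) in order
            if same_location and not ordered:
                return False
    return True

labels = {"e0": ("load", "acquire", "r0", "b"), "e1": ("load", "none", "r1", "a"),
          "e2": ("store", "none", "a", 1), "e3": ("store", "release", "b", 1)}
events = frozenset(labels)
reflexive = {(e, e) for e in events}

fig1 = (events, labels, reflexive | {("e0", "e1"), ("e2", "e3")})
n_shape = (events, labels, reflexive | {("e0", "e1"), ("e2", "e3"), ("e0", "e3")})
```

The partial string of Fig. 1 fails the check because the acquire `e0` and the release `e3` on `b` are unordered, whereas the N-shaped refinement passes even though the two accesses on `a` remain concurrent.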

Example 6. Examples 4 and 5 illustrate the SC-relaxed semantics of the concurrent
system in Fig. 3. In particular, the N-shaped partial string in Fig. 2
corresponds to a data race in $T_1 \parallel T_2$ because the non-synchronizing memory
accesses on memory location $a$ happen concurrently. To see this, it may help
to consider the interleaving $r_0 := [b]_{acquire}$; $[a]_{none} := 1$; $r_1 := [a]_{none}$; $[b]_{release} := 1$
where both memory accesses on location $a$ are unordered through the
happens-before relation because there is no release instruction separating $[a]_{none} := 1$
from $r_1 := [a]_{none}$. One way of fixing this data race is by changing thread $T_1$ to
if $[b]_{acquire} = 1$ then $r_1 := [a]_{none}$. Since CKA supports non-deterministic choice
with the $\cup$ binary operator (recall Theorem 1), it would not be difficult to give
semantics to such conditional checks, particularly if we introduce 'assume'
labels into the alphabet in Definition 10.

We ultimately want to show that the conjunction of three existing weak 
memory axioms as studied in [17] fully characterizes our particular interpretation
of relaxed sequential consistency, thereby paving the way for Theorem 4.
For this, we recall the following memory axioms which can be thought of as 
relations on loads and stores on the same memory location: 

Definition 12 (Memory axioms). Let $x$ be a partial string in $P_f$. The read-from
function, denoted by $rf : E_x \to E_x$, is defined to map every load to a store on the same
memory location. A load $l$ synchronizes-with a store $s$ if $rf(l) = s$ implies $s \preceq_x l$.
Write-coherence means that all stores $s, s'$ on the same memory location are totally
ordered by $\preceq_x$. The from-read axiom holds whenever, for all loads $l$ and stores $s, s'$ on
the same memory location, if $rf(l) = s$ and $s \prec_x s'$, then $l \preceq_x s'$.

By definition, the read-from function is total on all loads. The
synchronizes-with axiom says that if a load reads-from a store (necessarily on
the same memory location), then the store happens-before the load. This is also
known as the global read-from axiom [17]. Write-coherence, in turn, ensures that
all stores on the same memory location are totally ordered. This corresponds to
the fact that "all writes to the same location are serialized in some order and
are performed in that order with respect to any processor" [24]. Note that this
is different from the modification order ('mo') on atomics in C++14 [25] because
'mo' is generally not a subset of the happens-before relation. The from-read
axiom [17] requires that, for all loads l and two different stores s, s' on the
same location, if l reads-from s and s happens-before s', then l happens-before
s'. We start by deriving from these three memory axioms the notion of SC-relaxed
programs.
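
To make the three axioms tangible, here is a minimal sketch that checks them on a single memory location. It assumes that the happens-before relation is given as a transitively closed set of strict pairs and that the read-from function is a dictionary from loads to stores; the function names are illustrative and not taken from [17].

```python
def synchronizes_with(rf, hb):
    """If a load reads from a store, the store happens-before the load."""
    return all((s, l) in hb for l, s in rf.items())


def write_coherence(stores, hb):
    """All distinct stores on the location are ordered one way or the other."""
    return all((s, t) in hb or (t, s) in hb
               for s in stores for t in stores if s != t)


def from_read(rf, stores, hb):
    """If l reads from s and s happens-before t, then l happens-before t."""
    return all((l, t) in hb
               for l, s in rf.items()
               for t in stores
               if (s, t) in hb)


stores = ["s0", "s1"]
rf = {"l": "s0"}                                    # the load reads from s0
hb_ok = {("s0", "l"), ("l", "s1"), ("s0", "s1")}    # s0 before l before s1
hb_bad = {("s0", "s1"), ("s1", "l"), ("s0", "l")}   # s0 before s1 before l
```

In `hb_ok` all three axioms hold. In `hb_bad` the load reads a value that has already been overwritten by s1, which is exactly what the from-read axiom rules out, while the other two axioms still hold.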

Proposition 6 (SC-relaxed consistency). For all X in P, if, for each partial
string x in X, the synchronizes-with, write-coherence and from-read axioms hold
on all release and acquire events in $E_x$ on the same memory location, then X
is an SC-relaxed program.

Proof. Let $a \in \langle ADDRESS \rangle$ be a memory location, l be an acquire
on a and s' be a release on a. By write-coherence on release/acquire events, it
remains to show $l \prec_x s'$ or $s' \prec_x l$. Since the read-from function
is total, $\mathit{rf}(l) = s$ for some release s on a. By the synchronizes-with
axiom, $s \prec_x l$. If $s = s'$, we are done; we therefore assume
$s \neq s'$. By write-coherence, $s \prec_x s'$ or $s' \prec_x s$. The former
implies $l \prec_x s'$ by the from-read axiom, whereas the latter implies
$s' \prec_x l$ by transitivity. This proves, by case analysis, that X is an
SC-relaxed program. □

We need to prove some form of converse of the previous implication in order 
to characterize SC-relaxed semantics in terms of the three aforementioned weak 
memory axioms. For this purpose, we define the following: 

Definition 13 (Read consistency). Let $a \in \langle ADDRESS \rangle$ be a
memory location and x be a finite partial string in $P_f$. For all loads
$l \in E_x$ on a, define the following set of store events:
$\mathcal{H}_x(l) = \{ s \in E_x \mid s \preceq_x l \text{ and } s \text{ is a store on } a \}$.
The read-from function rf is said to satisfy weak read consistency whenever,
for all loads $l \in E_x$ and stores $s \in E_x$ on memory location a, the least
upper bound $\bigvee \mathcal{H}_x(l)$ exists, and $\mathit{rf}(l) = s$ implies
$\bigvee \mathcal{H}_x(l) \preceq_x s$; strong read consistency implies
$\mathit{rf}(l) = s = \bigvee \mathcal{H}_x(l)$.

By the next proposition, a natural sufficient condition for the existence of the
least upper bound $\bigvee \mathcal{H}_x(l)$ is the finiteness of the partial
strings in $P_f$ and the total ordering of all stores on the same memory
location from which the load l reads, i.e. write coherence. This could be
generalized to well-ordered sets.

Proposition 7 (Weak read consistency existence). For all partial strings x in
$P_f$, write coherence on memory location a implies that
$\bigvee \mathcal{H}_x(l)$ exists for all loads l on a.

We remark that $\bigvee \mathcal{H}_x(l) = \bot$ if
$\mathcal{H}_x(l) = \emptyset$; alternatively, to avoid that $\mathcal{H}_x(l)$
is empty, we could require that programs are always constructed such that their
partial strings have minimal store events that initialize all memory locations.
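
The set of stores below a load and its least upper bound can be computed directly on small examples. The following is a minimal sketch under the assumption that write coherence holds, so that the set is a finite chain and its least upper bound is simply its maximum; `None` stands in for the bottom element when the set is empty. All function names are illustrative.

```python
def leq(hb, a, b):
    """Reflexive happens-before: a equals b or a is strictly before b."""
    return a == b or (a, b) in hb


def stores_before(load, stores, hb):
    """The stores on the location that happen before the load."""
    return {s for s in stores if leq(hb, s, load)}


def lub(load, stores, hb):
    """Maximum of the chain of stores before `load`; None if the chain is empty."""
    chain = stores_before(load, stores, hb)
    for candidate in chain:
        if all(leq(hb, s, candidate) for s in chain):
            return candidate
    return None


def weak_read_consistent(rf, stores, hb):
    """Each load reads from a store that is not below the least upper bound."""
    for load, source in rf.items():
        top = lub(load, stores, hb)
        if top is not None and not leq(hb, top, source):
            return False
    return True


def strong_read_consistent(rf, stores, hb):
    """Each load reads from exactly the least upper bound."""
    return all(lub(load, stores, hb) == source for load, source in rf.items())


stores = {"s0", "s1"}
hb_after = {("s0", "s1"), ("s1", "l"), ("s0", "l")}    # s0 before s1 before l
hb_between = {("s0", "l"), ("l", "s1"), ("s0", "s1")}  # s0 before l before s1
```

Reading from s1 in `hb_after` is strongly read consistent, whereas reading from s0 there is not even weakly read consistent. Reading from s1 in `hb_between` is weakly but not strongly read consistent, since the load reads from a store that does not happen before it.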

Proposition 8 (Weak read consistency equivalence). Write coherence implies that
weak read consistency is equivalent to the following: for all loads l and stores
s, s' on memory location $a \in \langle ADDRESS \rangle$, if
$\mathit{rf}(l) = s$ and $s' \prec_x l$, then $s' \preceq_x s$.


Proof. By write coherence, $\bigvee \mathcal{H}_x(l)$ exists, and
$s' \preceq_x \bigvee \mathcal{H}_x(l)$ because $s' \in \mathcal{H}_x(l)$ by
assumption $s' \prec_x l$ and Definition 13. By assumption of weak read
consistency, $\bigvee \mathcal{H}_x(l) \preceq_x s$. From transitivity follows
$s' \preceq_x s$.

Conversely, assume $\mathit{rf}(l) = s$. Let s' be a store on a such that
$s' \in \mathcal{H}_x(l)$. Thus, by hypothesis, $s' \preceq_x s$. Since s' is
arbitrary, s is an upper bound. Since the least upper bound is well-defined by
write coherence, $\bigvee \mathcal{H}_x(l) \preceq_x s$. □

Weak read consistency therefore says that if a load l reads from a store s and 
another store s' on the same memory location happens before l, then s' happens
before s. This implies the next proposition. 
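
As a sanity check of this reading, the two formulations can be compared exhaustively on small inputs. The sketch below is not a proof: it only enumerates linear orders of a few stores and one load, which are a special case of write-coherent partial strings, and all names are illustrative.

```python
from itertools import permutations


def linear_hb(sequence):
    """Strict happens-before induced by a total order given as a sequence."""
    return {(a, b) for i, a in enumerate(sequence) for b in sequence[i + 1:]}


def weak_read_consistent(load, source, stores, hb):
    """The least upper bound of the stores before `load` is below `source`."""
    before = [s for s in stores if (s, load) in hb]
    if not before:
        return True  # the least upper bound is the bottom element
    # In a chain, the maximum is the element with the most predecessors.
    top = max(before, key=lambda s: sum((t, s) in hb for t in before))
    return top == source or (top, source) in hb


def pairwise_condition(load, source, stores, hb):
    """Every store before `load` is equal to or before `source`."""
    return all(t == source or (t, source) in hb
               for t in stores if (t, load) in hb)


def agree_on_all_linear_orders(stores, load="l"):
    """Compare both formulations for every linear order and every read-from choice."""
    for sequence in permutations(list(stores) + [load]):
        hb = linear_hb(sequence)
        for source in stores:
            if (weak_read_consistent(load, source, stores, hb)
                    != pairwise_condition(load, source, stores, hb)):
                return False
    return True
```

For three stores and one load, `agree_on_all_linear_orders(["s0", "s1", "s2"])` finds no disagreement between the least-upper-bound formulation and the pairwise one.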

Proposition 9 (From-read equivalence). For all SC-relaxed programs in P, weak 
read consistency with respect to release/acquire events is equivalent to the from-read
axiom with respect to release/acquire events. 

We can characterize strong read consistency as follows: 

Proposition 10 (Strong read consistency equivalence). Strong read consistency 
is equivalent to weak read consistency and the synchronizes-with axiom. 

Proof. Let x be a partial string in $P_f$. Let l be a load and s be a store on
the same memory location. The forward implication is immediate from
$\bigvee \mathcal{H}_x(l) \preceq_x l$. Conversely, assume
$\mathit{rf}(l) = s$. By synchronizes-with, $s \preceq_x l$, whence
$s \in \mathcal{H}_x(l)$. By definition of least upper bound,
$s \preceq_x \bigvee \mathcal{H}_x(l)$. Since
$s \succeq_x \bigvee \mathcal{H}_x(l)$, by hypothesis, and $\preceq_x$ is
antisymmetric, we conclude $s = \bigvee \mathcal{H}_x(l)$. □

Theorem 3 (SC-relaxed equivalence). For every program X in P, X is SC-relaxed
where, for all partial strings x in X and acquire events l in $E_x$,
$\mathit{rf}(l) = \bigvee \mathcal{H}_x(l)$, if and only if the
synchronizes-with, write-coherence and from-read axioms hold for all x in X with
respect to all release/acquire events in $E_x$ on the same memory location.

Proof. Assume X is an SC-relaxed program according to Definition 11. Let x be
a partial string in X and l be an acquire in the set of events $E_x$. By
Proposition 7, $\bigvee \mathcal{H}_x(l)$ exists. Assume
$\mathit{rf}(l) = \bigvee \mathcal{H}_x(l)$. Since l is arbitrary, this is
equivalent to assuming strong read consistency. Since release events are totally
ordered in $\preceq_x$, by assumption, it remains to show that the
synchronizes-with and from-read axioms hold. This follows from Propositions 10
and 9, respectively.

Conversely, assume the three weak memory axioms hold on x with respect
to all release/acquire events in $E_x$ on the same memory location. By
Proposition 6, X is an SC-relaxed program. Therefore, by Propositions 9 and 10,
$\mathit{rf}(l) = \bigvee \mathcal{H}_x(l)$, proving the equivalence. □

While the state-of-the-art weak memory encoding is cubic in size [18], the
previous theorem has as an immediate consequence that there exists an
asymptotically smaller weak memory encoding with only a quadratic number of
partial order constraints.

Theorem 4 (Quadratic-size weak memory encoding). There exists a quantifier-free
first-order logic formula that has a quadratic number of partial order
constraints and is equisatisfiable to the cubic-size encoding given in [18].


Proof. Instead of instantiating the three universally quantified events in the
from-read axiom, symbolically encode the least upper bound of weak read
consistency. This can be accomplished with a new symbolic variable for every
acquire event. It is easy to see that this reduces the cubic number of partial
order constraints to a quadratic number. □

In short, the asymptotic reduction in the number of partial order constraints
is due to a new symbolic encoding for how values are being overwritten in
memory: the current cubic-size formula [18] encodes the from-read axiom
(Definition 12), whereas the proposed quadratic-size formula encodes a certain
least upper bound (Definition 13). We reemphasize that this formulation is in
terms of release/acquire events rather than machine-specific accesses as in
[18]. The construction of the quadratic-size encoding, therefore, is generally
only applicable if we can translate the machine-specific reads and writes in a
shared memory program to acquire and release events, respectively. This may
require the program to be data race free, as illustrated in Example 6.
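
The difference in size can be illustrated by counting constraints for one memory location. The sketch below is a hypothetical rendering, not the encoding of [18] or a formula given in this paper: it assumes one from-read constraint per triple of a load and two distinct stores, and, for the alternative, one fresh variable `lub_l` per load together with two constraints per load/store pair. Constraints are represented as plain tuples and no solver is involved.

```python
def from_read_constraints(loads, stores):
    """One constraint per triple: rf(l) = s and s before t implies l before t."""
    return [("from_read", l, s, t)
            for l in loads for s in stores for t in stores if s != t]


def lub_constraints(loads, stores):
    """Hypothetical encoding with a fresh least-upper-bound variable per load."""
    constraints = []
    for l in loads:
        lub_var = f"lub_{l}"  # assumed symbolic variable for the least upper bound
        for s in stores:
            # If s is before l, then s is below the least upper bound.
            constraints.append(("below_lub", s, l, lub_var))
            # If l reads from s, then the least upper bound is below s.
            constraints.append(("reads_from", l, s, lub_var))
    return constraints


loads = [f"l{i}" for i in range(10)]
stores = [f"s{i}" for i in range(10)]
cubic = len(from_read_constraints(loads, stores))
quadratic = len(lub_constraints(loads, stores))
```

With n loads and m stores, the first count grows as n·m·(m−1) and the second as 2·n·m. For the ten loads and ten stores above this gives 900 versus 200 constraints.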

Furthermore, as mentioned in the introduction of this section, the primary 
application of Theorem 4 is in the context of BMC. Recall that BMC assumes 
that all loops in the shared memory program under scrutiny have been unrolled 
(the same restriction as in [18]). This makes it possible to symbolically encode 
branch conditions, thereby alleviating the need to explicitly enumerate each 
finite partial string in an elementary program. 

5 Concluding remarks 

This paper has studied a partial order model of computation that satisfies the
axioms of a unifying algebraic concurrency semantics by Hoare et al. By further
restricting the partial string semantics, we obtained a relaxed sequential
consistency semantics which was shown to be equivalent to the conjunction of
three weak memory axioms by Alglave et al. This allowed us to prove the
existence of an equisatisfiable but asymptotically smaller weak memory encoding
that has only a quadratic number of partial order constraints compared to the
state-of-the-art cubic-size encoding. In upcoming work, we will experimentally
compare both encodings in the context of bounded model checking using SMT
solvers. As future theoretical work, it would be interesting to study the
relationship between categorical models of partial string theory and event
structures.